Identifying Word Senses in Greek Text: A comparison of machine learning methods

Authors

  • A. Grigoriadis
  • G. Paliouras
  • V. Karkaletsis
  • C. D. Spyropoulos
Abstract

In this paper we perform a comparative evaluation of machine learning methods on the task of identifying the correct sense of a word, based on the context in which it appears. This task is known as word sense disambiguation (WSD) and is one of the hardest and most interesting issues in language engineering. Research on the use of machine learning techniques for WSD has so far focused almost exclusively on English words, due to the scarcity of the required linguistic resources for other languages. The work presented here is the first attempt to apply machine learning methods to Greek words. We have constructed a semantically tagged corpus for two Greek words: a noun with clearly distinguishable senses and a verb with overlapping senses. This corpus is used to evaluate four different machine learning methods and three different representations of the context of the ambiguous word. Our results show that the simple naïve Bayesian classifier and a method using Support Vector Machines outperform decision tree induction, even with the use of boosting. Furthermore, the use of a distance-based weighting function for the context of the ambiguous word does not seem to have a substantial effect on the performance of the methods.
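To make the setup in the abstract concrete, the following is a minimal sketch of a naïve Bayesian sense classifier over a window of context words, with an optional distance-based weighting of that context. It is not the authors' implementation. The window size, the 1/distance weighting function, the add-one smoothing, and the English toy sentences for the word "bank" are all assumptions made for illustration; the abstract does not specify the paper's actual features, weighting function, or corpus format.

```python
import math
from collections import Counter, defaultdict


def context_features(tokens, target_index, window=3, weighted=False):
    """Return {word: weight} for words within `window` positions of the target.

    With weighted=True each word contributes 1/distance instead of 1.
    The 1/distance form is an assumed example of a distance-based weighting.
    """
    feats = defaultdict(float)
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    for i in range(lo, hi):
        if i == target_index:
            continue
        feats[tokens[i]] += 1.0 / abs(i - target_index) if weighted else 1.0
    return dict(feats)


class NaiveBayesWSD:
    """Multinomial naive Bayes over (possibly fractional) context-word weights."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # additive smoothing constant
        self.sense_counts = Counter()            # training examples per sense
        self.word_weights = defaultdict(Counter) # sense -> word -> summed weight
        self.vocab = set()

    def fit(self, examples):
        for feats, sense in examples:
            self.sense_counts[sense] += 1
            for word, weight in feats.items():
                self.word_weights[sense][word] += weight
                self.vocab.add(word)
        return self

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            score = math.log(count / total)      # log prior of the sense
            denom = (sum(self.word_weights[sense].values())
                     + self.alpha * len(self.vocab))
            for word, weight in feats.items():
                if word not in self.vocab:
                    continue                     # ignore words unseen in training
                p = (self.word_weights[sense][word] + self.alpha) / denom
                score += weight * math.log(p)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical toy data: (sentence, index of the ambiguous word, sense label).
TRAIN = [
    ("she deposited money at the bank yesterday", 5, "finance"),
    ("the bank approved the loan", 1, "finance"),
    ("they fished from the river bank at dawn", 5, "river"),
    ("the muddy bank of the river flooded", 2, "river"),
]


def train_model(weighted):
    examples = [
        (context_features(text.split(), idx, weighted=weighted), sense)
        for text, idx, sense in TRAIN
    ]
    return NaiveBayesWSD().fit(examples)


def disambiguate(model, sentence, target_index, weighted):
    feats = context_features(sentence.split(), target_index, weighted=weighted)
    return model.predict(feats)


if __name__ == "__main__":
    for weighted in (False, True):
        model = train_model(weighted)
        print(weighted, disambiguate(model, "he borrowed money from the bank", 5, weighted))
```

Training the same classifier once with `weighted=False` and once with `weighted=True` mirrors, in miniature, the kind of comparison between context representations that the abstract reports; a real evaluation would use a tagged corpus and held-out test data rather than four toy sentences.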

Similar resources

Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models

Unsupervised word sense disambiguation (WSD) methods are an attractive approach to all-words WSD due to their non-reliance on expensive annotated data. Unsupervised estimates of sense frequency have been shown to be very useful for WSD due to the skewed nature of word sense distributions. This paper presents a fully unsupervised topic modelling-based approach to sense frequency estimation, whic...

A Naïve Bayes Approach for Word Sense Disambiguation

Word sense disambiguation (WSD) is the task of automatically selecting the correct sense given a context, and it helps in solving many ambiguity problems inherently existing in all natural languages. Statistical Natural Language Processing (NLP), which is based on probabilistic, stochastic and statistical methods, has been used to solve many NLP problems. The Naive Bayes algorithm which is one o...

Automatically Identifying Changes in the Semantic Orientation of Words

The meanings of words are not fixed but in fact undergo change, with new word senses arising and established senses taking on new aspects of meaning or falling out of usage. Two types of semantic change are amelioration and pejoration; in these processes a word sense changes to become more positive or negative, respectively. In this first computational study of amelioration and pejoration we ad...

Good Neighbors Make Good Senses: Exploiting Distributional Similarity for Unsupervised WSD

We present an automatic method for sense-labeling of text in an unsupervised manner. The method makes use of distributionally similar words to derive an automatically labeled training set, which is then used to train a standard supervised classifier for distinguishing word senses. Experimental results on the Senseval-2 and Senseval-3 datasets show that our approach yields significant improvement...

Clinical Word Sense Disambiguation with Interactive Search and Classification

Resolving word ambiguity in clinical text is critical for many natural language processing applications. Effective word sense disambiguation (WSD) systems rely on training a machine learning based classifier with abundant clinical text that is accurately annotated, the creation of which can be costly and time-consuming. We describe a double-loop interactive machine learning process, named ReQ-R...

Publication date: 2003